An MPI failure detector over PMPI 1

نویسنده

  • Donghoon Kim
چکیده

Fault Detectors are valuable services which provide information about process failures in largescale parallel systems. Previous many studies suggest guidelines for the implementation of a fault detector. However, a practical approach to implementation is another challenge due to various parallel system environments of both hardware and software. This study explains that Fault detector is able to be transparent, scalable, and portable. The experimental results show that Fault Detector can be embedded to any MPI application with negligible overhead.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

PMPI: High-Level Message Passing in Fortran 77 and C

The Message{Passing Interface (MPI) provides support for portable parallel programs, but often proves too complex to be convenient. In this paper we propose a higher{level Programmer's Message{ Passing Interface (PMPI) to the standard MPI libraries that is better suited to the needs of application programmers. PMPI largely hides the binding of message{passing routines to the programming languag...

متن کامل

Asynchronous MPI for the Masses

We present a simple library which equips MPI implementations with truly asynchronous non-blocking point-to-point operations, and which is independent of the underlying communication infrastructure. It utilizes the MPI profiling interface (PMPI) and the MPI_THREAD_MULTIPLE thread compatibility level, and works with current versions of Intel MPI, Open MPI, MPICH2, MVAPICH2, Cray MPI, and IBM MPI....

متن کامل

Practical Model-Checking Method for Verifying Correctness of MPI Programs

Formal program verification often requires creating a model of the program and running it through a model-checking tool. However, this model-creation step is itself error prone, tedious, and difficult for someone not familiar with formal verification. In this paper, we describe a tool for verifying correctness of MPI programs that does not require the creation of a model and instead works direc...

متن کامل

Implementation and Usage of the PERUSE-Interface in Open MPI

This paper describes the implementation, usage and experience with the MPI performance revealing extension interface (Peruse) into the Open MPI implementation. While the PMPI-interface allows timing MPI-functions through wrappers, it can not provide MPI-internal information on MPI-states and lower-level network performance. We introduce the general design criteria of the interface implementatio...

متن کامل

Automatic Pro ling of MPI Applications with Hardware Performance Counters

This paper presents an automatic counter instrumentation and pro ling module added to the MPI library on Cray T3E and SGI Origin2000 systems. A detailed summary of the hardware performance counters and the MPI calls of any MPI production program is gathered during execution and written in MPI Finalize on a special syslog le. The user can get the same information in a di erent le. Statistical su...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009